
    PIntron: a Fast Method for Gene Structure Prediction via Maximal Pairings of a Pattern and a Text

    Current computational methods for predicting exon-intron structure from a cluster of transcript (EST, mRNA) data lack the time and space efficiency needed to process large clusters of more than 20,000 ESTs and genes longer than 1 Mb. Guaranteeing both accuracy and efficiency remains a distant computational goal, since accuracy is strictly tied to exploiting the inherent redundancy of information present in a large cluster. We propose a fast method for the problem that combines two ideas: a novel algorithm of provably small time complexity for computing spliced alignments of a transcript against a genome, and an efficient algorithm that exploits the inherent redundancy of information in a cluster of transcripts to select, among all possible factorizations of EST sequences, those that allow the inference of splice site junctions highly confirmed by the input data. The EST alignment procedure is based on the construction of maximal embeddings, which are sequences obtained from paths of a graph structure, called the Embedding Graph, whose vertices are the maximal pairings of a genomic sequence T and an EST P. The procedure runs in time linear in the size of P, T, and the output. PIntron, the software tool implementing our methodology, can process in a few seconds some critical genes that are not manageable by other gene structure prediction tools. At the same time, PIntron exhibits high accuracy (sensitivity and specificity) when compared with ENCODE data. Detailed experimental data, additional results and PIntron software are available at http://www.algolab.eu/PIntron
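
To make the notion of a maximal pairing concrete, here is a minimal sketch. It assumes a maximal pairing is an exact common substring of P and T that cannot be extended to the left or to the right, reported as a triple (start in P, start in T, length). The function name `maximal_pairings` and the `min_len` parameter are hypothetical, and the quadratic-time dynamic programming used here is a simple stand-in for illustration, not the linear-time procedure described in the abstract.

```python
def maximal_pairings(P, T, min_len=1):
    """Return (p, t, length) triples with P[p:p+length] == T[t:t+length]
    such that the match cannot be extended left or right.

    Naive O(|P| * |T|) dynamic programming, for illustration only.
    """
    pairings = []
    # prev[j] = length of the longest common suffix of P[:i-1] and T[:j]
    prev = [0] * (len(T) + 1)
    for i in range(1, len(P) + 1):
        cur = [0] * (len(T) + 1)
        for j in range(1, len(T) + 1):
            if P[i - 1] != T[j - 1]:
                continue
            # Extending the longest common suffix guarantees left-maximality.
            cur[j] = prev[j - 1] + 1
            # Right-maximal if either string ends or the next characters differ.
            right_maximal = i == len(P) or j == len(T) or P[i] != T[j]
            if right_maximal and cur[j] >= min_len:
                length = cur[j]
                pairings.append((i - length, j - length, length))
        prev = cur
    return pairings


if __name__ == "__main__":
    # Toy EST and toy genomic sequence (made-up data).
    for pairing in sorted(maximal_pairings("ACGT", "TTACGAACGT", min_len=3)):
        print(pairing)
```

In the abstract's terms, triples like these would serve as the vertices of the Embedding Graph; how the edges are defined and how maximal embeddings are read off the paths is not specified here and is not sketched.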

    ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences

    BACKGROUND: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems, hence the need to develop novel strategies. RESULTS: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid the limitations of splice site prediction (mainly over-prediction) that arise from independent single EST alignments to the genomic sequence, our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It differs from other methods based on BLAST-like tools in that it incorporates entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. CONCLUSION: Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at

    ASPIC: a web resource for alternative splicing prediction and transcript isoforms characterization

    Alternative splicing (AS) is now emerging as a major mechanism contributing to the expansion of the transcriptome and proteome complexity of multicellular organisms. The fact that a single gene locus may give rise to multiple mRNAs and protein isoforms, showing both major and subtle structural variations, is an exceptionally versatile tool in the optimization of the coding capacity of the eukaryotic genome. The huge and continuously increasing number of genome and transcript sequences provides an essential information source for the computational detection of gene AS patterns. However, much of this information is not optimally or comprehensively used in gene annotation by current genome annotation pipelines. We present here a web resource implementing the ASPIC algorithm, which we developed previously, for the investigation of AS of user-submitted genes, based on comparative analysis of available transcript and genome data from a variety of species. The ASPIC web resource provides graphical and tabular views of the splicing patterns of all full-length mRNA isoforms compatible with the detected splice sites of genes under investigation, as well as relevant structural and functional annotation. The ASPIC web resource—available at —is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility

    Genetic parameters of fatty acids in Italian Brown Swiss and Holstein cows

    The aim of this study was to estimate the genetic parameters and to predict experimental breeding values (EBVs) for saturated (SFA), unsaturated (UFA), monounsaturated (MUFA) and polyunsaturated (PUFA) fatty acids, the ratio of fatty acids, and the productive traits in Italian Brown Swiss (BSW) and Holstein Friesian (HOL) cattle. Test-day yields from 235,658 HOL and 21,723 BSW cows in 3310 herds, recorded from November 2009 to October 2012, were extracted from the databases of the Italian HOL and BSW Associations. The milk samples collected within the routine milk recording scheme were processed with the Milkoscan™ FT 6500 Plus (Foss, Hillerød, Denmark) to determine the SFA, UFA, MUFA and PUFA composition of the milk. Genetic parameters for fatty acids and productive traits were estimated on 1,765,552 records in HOL and 255,592 records in BSW. Heritability values estimated for SFA, UFA, MUFA and PUFA ranged from 0.06 to 0.18 for the BSW breed and from 0.10 to 0.29 for HOL. The genetic trends for the fatty acids were consistent between traits and breeds. Pearson's and Spearman's correlations between EBVs for SFA, UFA, MUFA and PUFA and official EBVs for fat percentage were in the range 0.32 to 0.54 for BSW and 0.44 to 0.64 for HOL. The prediction of specific EBVs for milk fatty acids and for the ratios among them may be useful for identifying the best bulls to select with the aim of improving milk quality in terms of fat content and fatty acid ratios, achieving healthier dairy production for consumers

    International nonproprietary names for monoclonal antibodies: an evolving nomenclature system

    Appropriate nomenclature for all pharmaceutical substances is important for clinical development, licensing, prescribing, pharmacovigilance, and identification of counterfeits. Nonproprietary names that are unique and globally recognized for all pharmaceutical substances are assigned by the International Nonproprietary Names (INN) Programme of the World Health Organization (WHO). In 1991, the INN Programme implemented the first nomenclature scheme for monoclonal antibodies. To keep pace with biotechnological development, this nomenclature scheme has evolved over the years; however, since the scheme was introduced, all pharmacological substances that contained an immunoglobulin variable domain have been named with the stem -mab. To date, there are 879 INN with the stem -mab. Owing to this high number of names ending in -mab, devising new and distinguishable INN has become a challenge. The WHO INN Expert Group therefore decided to revise the system to ease this situation. The revised system was approved and adopted by the WHO at the 73rd INN Consultation held in October 2021, and the radical decision was made to discontinue the use of the well-known stem -mab in naming new antibody-based drugs and, going forward, to replace it with four new stems: -tug, -bart, -mig, and -ment